F: Regression Models over Factorized Views

نویسندگان

  • Dan Olteanu
  • Maximilian Schleich
چکیده

We demonstrate F, a system for building regression models over database views. At its core lies the observation that the computation and representation of materialized views, and in particular of joins, entail non-trivial redundancy that is not necessary for the efficient computation of aggregates used for building regression models. F avoids this redundancy by factorizing data and computation and can outperform the state-of-the-art systems MADlib, R, and Python StatsModels by orders of magnitude on real-world datasets. We illustrate how to incrementally build regression models over factorized views using both an in-memory implementation of F and its SQL encoding. We also showcase the effective use of F for model selection: F decouples the datadependent computation step from the data-independent convergence of model parameters and only performs once the former to explore the entire model space.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Incremental View Maintenance with Triple Lock Factorization Benefits

We introduce F-IVM, a unified incremental view maintenance (IVM) approach for a variety of tasks, including gradient computation for learning linear regression models over joins, matrix chain multiplication, and factorized evaluation of conjunctive queries. F-IVM is a higher-order IVM algorithm that reduces the maintenance of the given task to the maintenance of a hierarchy of increasingly simp...

متن کامل

FOLS: Factorized Orthogonal Latent Spaces

Many machine learning problems inherently involve multiple views. Kernel combination approaches to multiview learning [1] are particularly effective when the views are independent. In contrast, other methods take advantage of the dependencies in the data. The best-known example is Canonical Correlation Analysis (CCA), which learns latent representations of the views whose correlation is maximal...

متن کامل

Factorized Latent Spaces with Structured Sparsity

Recent approaches to multi-view learning have shown that factorizing the information into parts that are shared across all views and parts that are private to each view could effectively account for the dependencies and independencies between the different input modalities. Unfortunately, these approaches involve minimizing non-convex objective functions. In this paper, we propose an approach t...

متن کامل

Factorization Machines Factorized Polynomial Regression Models

Factorization Machines (FM) are a new model class that combines the advantages of polynomial regression models with factorization models. Like polynomial regression models, FMs are a general model class working with any real valued feature vector as input for the prediction of real-valued, ordinal or categorical dependent variables as output. However, in contrast to polynomial regression models...

متن کامل

Incremental Maintenance of Regression Models over Joins

This paper introduces a principled incremental view maintenance (IVM) mechanism for in-database computation described by rings. We exemplify our approach by introducing the covariance matrix ring that we use for learning linear regression models over arbitrary equi-join queries. Our approach is a higher-order IVM algorithm that exploits the factorized structure of joins and aggregates to avoid ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • PVLDB

دوره 9  شماره 

صفحات  -

تاریخ انتشار 2016